Stieltjes transform
Spectral Estimation with Free Decompression
Computing eigenvalues of very large matrices is a critical task in many machine learning applications, including the evaluation of log-determinants, the trace of matrix functions, and other important metrics. As datasets continue to grow in scale, the corresponding covariance and kernel matrices become increasingly large, often reaching magnitudes that make their direct formation impractical or impossible. Existing techniques typically rely on matrix-vector products, which can provide efficient approximations, if the matrix spectrum behaves well. However, in settings like distributed learning, or when the matrix is defined only indirectly, access to the full data set can be restricted to only very small sub-matrices of the original matrix. In these cases, the matrix of nominal interest is not even available as an implicit operator, meaning that even matrix-vector products may not be available. In such settings, the matrix is "impalpable," in the sense that we have access to only masked snapshots of it. We draw on principles from free probability theory to introduce a novel method of "free decompression" to estimate the spectrum of such matrices. Our method can be used to extrapolate from the empirical spectral densities of small submatrices to infer the eigenspectrum of extremely large (impalpable) matrices (that we cannot form or even evaluate with full matrix-vector products). We demonstrate the effectiveness of this approach through a series of examples, comparing its performance against known limiting distributions from random matrix theory in synthetic settings, as well as applying it to submatrices of real-world datasets, matching them with their full empirical eigenspectra.
Free Decompression with Algebraic Spectral Curves
Ameli, Siavash, van der Heide, Chris, Hodgkinson, Liam, Mahoney, Michael W.
At the core of scientific computing and much of modern machine learning (ML) lies the challenge of estimating the eigenvalues of high-dimensional Hermitian matrices. Such matrices, including kernels, Hessians, and graph representations, encode the intrinsic geometry and connectivity of the data and models built on them, rendering the pursuit of efficient spectral techniques a primary concern for both theory and practice. Studying eigenspectra has become a prominent approach to understanding performance and guiding training in deep learning [10, 20, 36, 53]. In many cases, the spectra of such matrices have non-trivial structure, often containing spikes, multiple multi-modal bulks, and heavy tails [14, 25]. Conventional algorithms to extract eigenvalue information from these matrices have required that the data can be stored in memory or scratch space, or can at least be accessed as an implicit operator (via matrix-vector products). More recently, a new class of algorithms has emerged that is able to provide highly accurate estimates of the eigenvalues (or summary functionals thereof [2]) of matrices, even without implicit or explicit access to the full matrix, i.e., of so-called impalpable matrices [1]. One such method, termed Free Decompression (FD), shows great promise as a tool for gaining access to the spectral distributions of such impalpable matrices. The central premise is that by appropriately sampling a small sub-matrix from the large impalpable matrix of interest, one can evolve a partial differential equation (PDE) for the Stieltjes transform of its spectral density along the decompression ratio until the desired matrix dimension is reached.
High-Dimensional Partial Least Squares: Spectral Analysis and Fundamental Limitations
Léger, Victor, Chatelain, Florent
Partial Least Squares (PLS) is a widely used method for data integration, designed to extract latent components shared across paired high-dimensional datasets. Despite decades of practical success, a precise theoretical understanding of its behavior in high-dimensional regimes remains limited. In this paper, we study a data integration model in which two high-dimensional data matrices share a low-rank common latent structure while also containing individual-specific components. We analyze the singular vectors of the associated cross-covariance matrix using tools from random matrix theory and derive asymptotic characterizations of the alignment between estimated and true latent directions. These results provide a quantitative explanation of the reconstruction performance of the PLS variant based on Singular Value Decomposition (PLS-SVD) and identify regimes where the method exhibits counter-intuitive or limiting behavior. Building on this analysis, we compare PLS-SVD with principal component analysis applied separately to each dataset and show its asymptotic superiority in detecting the common latent subspace. Overall, our results offer a comprehensive theoretical understanding of high-dimensional PLS-SVD, clarifying both its advantages and fundamental limitations.
Asymptotic behavior of eigenvalues of large rank perturbations of large random matrices
Afanasiev, Ievgenii, Berlyand, Leonid, Kiyashko, Mariia
Random Matrix Theory (RMT) is a classical theory that has been developing for more than 70 years. Initially, RMT arose from problems in nuclear physics and found its applications in mathematics, physics, finance, and many other disciplines. Recently, new problems have been arising from the area of Machine Learning. Indeed, often the weight matrices of Deep Neural Networks (DNNs) are initialized randomly. Moreover, modern DNNs have large weight matrices, which is why their spectral properties can be described by asymptotic behavior of N × N random matrices as N goes to infinity.